8. Content Filtering
The Content Filter agent uses SmartScreen technology to analyze the content of every message and evaluate whether it is spam.
SmartScreen is the name of the core content filtering technology developed by Microsoft Research and used in multiple Microsoft products under various names. In Exchange it is known as the Intelligent Message Filter (IMF), whereas Outlook uses SmartScreen in its Junk E-mail Filter. The same core engine is also used in Hotmail. The content filter uses the Microsoft Exchange Anti-Spam Update service to update its filters.
After the message is received, the Content Filter agent evaluates the message's content for recognizable patterns and then assigns a rating based on the probability that the message is spam. This rating is attached to the message as a spam confidence level (SCL), which is a numerical value between -1 and 9. Table 3 provides an overview of the SCL ratings and the spam confidence level definition.
Table 3. SCL Ratings and Definitions
SCL RATING | SPAM CONFIDENCE LEVEL DEFINITION |
---|
–1 | Messages are from a trusted source (internal, authenticated, or safelisted). |
0 | Messages are categorized as not spam. |
1 – 4 | The likelihood of being spam is extremely low to low. |
5 – 9 | The likelihood of being spam is high to extremely high. |
The Content Filter agent
scans messages only on Edge Transport servers and Hub Transport servers
that have the anti-spam agents installed.
The SCL rating of the message is stored in the header of the message and can be found by identifying the X-MS-Exchange-Organization-SCL value, as displayed in Figure 5.
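As a minimal sketch of reading that header value, the following plain Windows PowerShell snippet extracts the SCL from a block of header text. The sample headers and host names are made up for illustration; only the header name comes from the text above.

```powershell
# Sample header text; the values are hypothetical and only illustrate the format.
$rawHeaders = @"
Received: from mail.fabrikam.com (192.0.2.10) by edge01.contoso.com
X-MS-Exchange-Organization-SCL: 6
Subject: Limited time offer
"@

# (?m) lets ^ match at the start of each line; -1 is a valid SCL, so allow a sign.
if ($rawHeaders -match '(?m)^X-MS-Exchange-Organization-SCL:\s*(-?\d+)') {
    $scl = [int]$Matches[1]
    "SCL rating: $scl"
} else {
    "No SCL header found"
}
```

With the sample text above, the snippet reports an SCL rating of 6.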
Content filtering is enabled by default on Edge Transport servers (or Hub Transport servers that have the anti-spam agents installed) and is configured to reject all messages with an SCL rating equal to or greater than 7.
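To check what is currently in effect on a given server, a sketch along these lines can be run in the Exchange Management Shell. It only reads the configuration and changes nothing.

```powershell
# Run in the Exchange Management Shell on a server with the anti-spam agents installed.
# Shows whether content filtering is enabled and which SCL thresholds are active.
Get-ContentFilterConfig |
    Format-List Enabled, SCLRejectEnabled, SCLRejectThreshold,
                SCLDeleteEnabled, SCLDeleteThreshold,
                SCLQuarantineEnabled, SCLQuarantineThreshold, QuarantineMailbox
```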
8.1. Configuring the Content Filter
You can modify the default content filtering settings by using the Exchange Management Console or the Exchange Management Shell:
Configure Custom Words: You can specify a list of keywords or phrases that prevent a message containing them from being blocked. This feature is useful if your organization must receive e-mail that contains words that the Content Filter agent would normally block. You can also specify keywords or phrases that cause the Content Filter agent to block any message containing them.
Specify Exceptions: You can configure an exceptions list. Messages addressed to recipients on that list are excluded from content filtering.
Specify Actions: You can configure the SCL thresholds and the action taken at each threshold. The Content Filter agent can delete, reject, or quarantine messages with an SCL rating equal to or greater than the value you specify. You can also define the quarantine mailbox.
Note: When the Content Filter agent rejects a message, it uses the default response of "550 5.7.1 Message rejected due to content restrictions". You can customize this response by using the Set-ContentFilterConfig cmdlet in the Exchange Management Shell.
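The settings described in this section could be applied from the Exchange Management Shell roughly as follows. This is a sketch, not a recommended configuration: the phrases, addresses, threshold values, and response text are all hypothetical examples.

```powershell
# Custom words: allow messages containing one phrase, block those containing another.
# (The phrases are hypothetical examples.)
Add-ContentFilterPhrase -Phrase "quarterly invoice" -Influence GoodWord
Add-ContentFilterPhrase -Phrase "miracle diet" -Influence BadWord

# Exceptions: skip content filtering for mail addressed to this recipient.
Set-ContentFilterConfig -BypassedRecipients "helpdesk@contoso.com"

# Actions: define the quarantine mailbox, then quarantine at SCL 6 and reject at SCL 8.
Set-ContentFilterConfig -QuarantineMailbox "spamquarantine@contoso.com"
Set-ContentFilterConfig -SCLQuarantineEnabled $true -SCLQuarantineThreshold 6
Set-ContentFilterConfig -SCLRejectEnabled $true -SCLRejectThreshold 8

# Custom rejection response, replacing the default text returned to the sending server.
Set-ContentFilterConfig -RejectionResponse "Message rejected as spam. Contact postmaster@contoso.com if this is an error."

# Review the configured phrases.
Get-ContentFilterPhrase
```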
8.2. Configure SCL Junk E-Mail Folder Threshold
Besides configuring the
Content Filter to reject, quarantine, or delete messages on the
incoming stream, you can also configure SCL threshold levels to move
mail items automatically to the Junk folder of the mailbox. You can
configure this either on a mailbox level or an organizational level.
Note: With an SCL Junk E-mail threshold configured, Exchange automatically moves the affected messages to the Junk E-mail folder. If your users' messaging client cannot access that folder, users will never be aware of the junk mail. This is not a problem if your users are using Outlook or OWA. POP3 clients, however, cannot see mail moved to the Junk E-mail folder, because the POP3 protocol only exposes the Inbox, and IMAP4 clients should include the Junk E-mail folder in their list of subscribed folders.
8.2.1. SCL Configuration on a Mailbox Level
To configure specific junk mail parameters on a single mailbox, you need to use the Set-Mailbox cmdlet. This cmdlet includes various parameters that allow you to configure SCL thresholds and their actions. Table 4 provides an overview of which SCL thresholds can be configured.
Table 4. Set-Mailbox Cmdlet Anti-Spam Parameter Overview
SET-MAILBOX PARAMETERS | RESULT |
---|
AntispamBypassEnabled | If set to $true, no anti-spam agent will scan a message from or to this mailbox. |
RequireSenderAuthenticationEnabled | If this is configured, any sender that addresses this mailbox must be authenticated. |
SCLDeleteEnabled, SCLDeleteThreshold | Defines the SCL threshold for the mailbox at which messages are deleted: any message rated equal to or above the threshold. |
SCLJunkEnabled, SCLJunkThreshold | Defines the SCL threshold for the mailbox at which messages are moved to the Junk folder: any message rated above the threshold. When SCLJunkEnabled is set to $True and SCLJunkThreshold is set to a value such as 5, the mailbox setting overrides the organizational setting. |
SCLQuarantineEnabled, SCLQuarantineThreshold | Defines the SCL threshold for the mailbox at which messages are moved to the quarantine mailbox: any message rated equal to or above the threshold. |
SCLRejectEnabled, SCLRejectThreshold | Defines the SCL threshold for the mailbox at which messages are rejected: any message rated equal to or above the threshold. |
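As an illustration of the parameters in Table 4, the following sketch sets a per-mailbox junk threshold and exempts a second mailbox from anti-spam scanning. The mailbox identities are hypothetical.

```powershell
# Give one user a per-mailbox Junk E-mail threshold of 5.
Set-Mailbox -Identity "kim@contoso.com" -SCLJunkEnabled $true -SCLJunkThreshold 5

# Exempt a mailbox (hypothetical abuse-reporting address) from anti-spam scanning.
Set-Mailbox -Identity "abuse@contoso.com" -AntispamBypassEnabled $true

# Review the anti-spam settings on the first mailbox.
Get-Mailbox -Identity "kim@contoso.com" | Format-List SCL*, AntispamBypassEnabled
```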
8.2.2. SCL Configuration on an Organizational Level
You can also define an SCL threshold level for every mailbox in your Exchange organization by using the following cmdlet:
Set-OrganizationConfig -SCLJunkThreshold <SCL #>
If the SCL rating for a specific message exceeds the SCL Junk E-mail folder threshold, the Mailbox server moves the message to the user's Junk E-mail folder. The default value is 4.
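A short sketch of changing the organizational threshold and confirming the result follows. The value 5 is only an example.

```powershell
# Set the organization-wide Junk E-mail threshold (example value).
Set-OrganizationConfig -SCLJunkThreshold 5

# Confirm the value now in effect.
Get-OrganizationConfig | Format-List SCLJunkThreshold
```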
Andreas Bode
Messaging Consultant, Siemens AG, Germany
One of my customers wanted to identify spam by adding the word SPAM: to the message subject as a prefix. With Transport rules, it is very easy to add SPAM: as a prefix for messages at the SCL rating I defined. However, I noticed that the rule did not catch the SCL rating at all.
After some research, I found out that the problem was caused by the priority in which the Transport agents are processed. By default, the Edge Rule Agent has a priority of 3 and the Content Filter Agent a priority of 4. In practice, this means that the rule is executed before the Content Filter agent sets an SCL rating on the message.
To solve this problem, I ran the following cmdlet to move the Content Filter Agent ahead of the Edge Rule Agent in priority:
Set-TransportAgent "Content Filter Agent" -Priority 3
After a Transport service restart, my Transport rule marked all messages that exceeded the SCL threshold I defined accordingly.
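The full sequence from this sidebar, including a check of the agent order before and after the change, might look like the sketch below. Restarting the Transport service interrupts mail flow briefly, so plan accordingly.

```powershell
# List the transport agents with their current priorities.
Get-TransportAgent | Format-Table Identity, Priority, Enabled -AutoSize

# Move the Content Filter Agent ahead of the Edge Rule Agent.
Set-TransportAgent -Identity "Content Filter Agent" -Priority 3

# The change takes effect after the Transport service restarts.
Restart-Service MSExchangeTransport

# Confirm the new order.
Get-TransportAgent | Format-Table Identity, Priority, Enabled -AutoSize
```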
8.3. Safe List and Block List Aggregation
In Exchange Server 2010, the Content Filter agent on the Edge Transport server uses the Microsoft Outlook Safe Senders lists, Safe Recipients lists, and trusted contacts to optimize spam filtering. Safelist aggregation is a set of anti-spam functionality that Outlook and Exchange Server 2010 share. It collects data from the anti-spam safe lists that Microsoft Outlook or OWA users configure and makes this data available to the anti-spam agents on the Edge Transport server.
Unlike in Exchange 2007, safelist aggregation is enabled by default in Exchange 2010, and the Mailbox assistant performs it automatically. Another difference between Exchange 2007 and Exchange 2010 is Block list aggregation. In Exchange 2007, the Block list logic was triggered at the client end. In Exchange 2010, the logic has moved to the Sender filter and is executed much earlier in anti-spam processing. So in addition to aggregating the existing Safe Senders and Safe Recipients lists, Exchange 2010 also aggregates Block list entries and makes them available to Sender filtering.
The safelist collection is stored in a hidden item in the root folder of the user's mailbox. A user can have up to 1,024 unique entries in a safelist collection. Exchange 2010 and the Junk E-mail Options mailbox assistant then replicate the changes from the mailbox to the user's Active Directory account (namely to the msExchSafeSendersHash, msExchSafeRecipientsHash, and msExchBlockedSendersHash attributes).
EdgeSync then synchronizes the safelist collections to the Edge Transport servers where the aggregated data is used by the Content Filtering agent.
Unlike in Exchange 2007, you no longer need to run the Update-Safelist cmdlet to gather and prepare safelist data from user mailboxes. However, you can still use the Update-Safelist cmdlet to manually update the safelist in Active Directory.
You can find more details at http://technet.microsoft.com/en-us/library/bb125168.aspx.
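If you do want to trigger the update by hand, a sketch such as the following could be used. The mailbox identity is hypothetical, and forcing an EdgeSync cycle afterward is an optional extra step, not something the text above requires.

```powershell
# Manually write one user's safelist collection to Active Directory.
# "Both" covers the Safe Senders and Safe Recipients lists.
Update-Safelist -Identity "kim@contoso.com" -Type Both

# Optionally push the change to the Edge Transport servers right away
# instead of waiting for the next scheduled EdgeSync cycle.
# Run this on a Hub Transport server in the subscribed site.
Start-EdgeSynchronization
```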
9. Sender Reputation Filtering
The Exchange Server 2010 Sender Reputation feature makes message-filtering decisions based on information about recent e-mail messages received from specific senders. The Sender Reputation agent analyzes various properties of the sender and the e-mail message to create a Sender Reputation Level (SRL). This SRL is a number between 0 and 9, where a value of 0 indicates less than a 1 percent chance that the sender is sending spam, and a value of 9 indicates more than a 99 percent chance. If a sender appears to be a spam source, the Sender Reputation agent automatically adds the IP address of the SMTP server that is sending the message to the IP Block list.
9.1. How Sender Reputation Filtering Works
When the Transport server
receives the first message from a specific sender, the SMTP sender is
assigned an SRL of 0. As more messages arrive from the same source, the
Sender Reputation agent evaluates the messages and begins to adjust the
sender's rating. The Sender Reputation agent uses the following
criteria to evaluate each sender:
Sender open proxy test: An open proxy is a proxy server that accepts connection requests from any SMTP server and then forwards messages as though they originated from the local host. This is also known as an open relay server. When the Sender Reputation agent calculates an SRL, it formats an SMTP request in an attempt to connect back to the Edge Transport server through the suspected proxy. If the SMTP request is received from the proxy, the Sender Reputation agent has verified that the proxy is an open proxy and updates that sender's open proxy test statistic.
HELO/EHLO analysis: The HELO and EHLO SMTP commands are intended to provide the receiving server with the domain name, such as Contoso.com, or the IP address of the sending SMTP server. Spammers frequently modify the HELO/EHLO statement to use an IP address that does not match the IP address from which the connection originated, or to use a domain name that is different from the actual originating domain name. If the same sender uses multiple domain names or IP addresses in the HELO or EHLO commands, there is an increased chance that the sender is a spammer.
Reverse DNS lookup: The Sender Reputation agent also verifies that the originating IP address from which the sender transmitted the message matches the registered domain name that the sender submits in the HELO or EHLO SMTP command. The Sender Reputation agent performs a reverse DNS query by submitting the originating IP address to DNS. If the domain names do not match, the sender is more likely to be a spammer, and the overall SRL rating for the sender is adjusted upward.
SCL ratings analysis on a particular sender's messages: When the Content Filter agent processes a message, it attaches an SCL rating to the message. The Sender Reputation agent analyzes the data about each sender's SCL ratings and uses it to calculate SRL ratings.
The Sender
Reputation agent calculates the SRL for each unique sender over a
specific time. When the SRL rating exceeds the configured limit, the IP
address for the sending SMTP server is added to the IP Block list for a
specific amount of time.
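The reverse DNS criterion above can be approximated outside Exchange with a few lines of Windows PowerShell. This is only a sketch of the idea, not the Sender Reputation agent's actual logic: the function name Test-HeloMatchesReverseDns is hypothetical, and treating a host name inside the HELO domain as a match is an assumption.

```powershell
# Hypothetical helper: compares the PTR name of the connecting IP address
# with the domain the sender announced in HELO/EHLO.
function Test-HeloMatchesReverseDns {
    param(
        [string]$RemoteIP,     # IP address the SMTP connection came from
        [string]$HeloDomain    # domain name submitted in HELO/EHLO
    )
    try {
        # Reverse lookup of the originating IP address.
        $ptrName = [System.Net.Dns]::GetHostEntry($RemoteIP).HostName
    } catch {
        # Lookup failed: treat as a mismatch.
        return $false
    }
    $ptr  = $ptrName.TrimEnd('.').ToLower()
    $helo = $HeloDomain.TrimEnd('.').ToLower()
    # Assumption: an exact match or a host inside the HELO domain counts as a match.
    return ($ptr -eq $helo) -or $ptr.EndsWith(".$helo")
}

# Example call with a documentation-range address; the result depends on live DNS.
Test-HeloMatchesReverseDns -RemoteIP "192.0.2.10" -HeloDomain "fabrikam.com"
```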
9.2. Sender Reputation Configuration
You can configure the Sender Reputation settings on the Edge Transport server. By using the Exchange Management Console, you can configure Sender Confidence (enabling or disabling the sender open proxy test), the Sender Reputation block threshold, and the timeout period that determines how long a sender remains on the IP Block list. Figure 6 shows the default settings of Sender Reputation filtering.
If you want to configure advanced settings, you need to use the Set-SenderReputationConfig cmdlet, which allows you to fine-tune this feature.
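In the Exchange Management Shell, these settings are exposed through the Get-SenderReputationConfig and Set-SenderReputationConfig cmdlets. The sketch below reads the current values and then changes the block threshold, the blocking period, and the open proxy test. The values are examples only.

```powershell
# Review the current Sender Reputation settings.
Get-SenderReputationConfig |
    Format-List Enabled, SrlBlockThreshold, SenderBlockingEnabled,
                SenderBlockingPeriod, OpenProxyDetectionEnabled

# Block senders at an SRL of 6 and keep them on the IP Block list for 36 hours.
Set-SenderReputationConfig -SrlBlockThreshold 6 -SenderBlockingPeriod 36

# Turn off the sender open proxy test, for example if outbound probes are not allowed.
Set-SenderReputationConfig -OpenProxyDetectionEnabled $false
```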
10. Attachment Filtering
Attachment filtering allows you to choose the attachment names, file extensions, or file MIME content types your users can receive. This is necessary to protect your users from malicious messages. One of the most famous viruses in messaging history, the Melissa virus, was spread using a malicious attachment. Obviously dangerous attachments such as scripts or executables are now routinely removed, after they caused complete mail disruptions at large organizations years ago. You may remember the "I love you" virus that sent messages to all contacts of the local mailbox. That's just one good reason to consider attachment filtering!
Attachment filtering in Exchange 2010 can be based on the following filtering criteria: the file name or file name extension, and the file MIME content type.
You can use the Get-AttachmentFilterEntry cmdlet to display the currently configured attachment filters, as shown in Figure 7.
When a filtering criterion is met, the following actions can be performed on the message:
Strip attachment but deliver message
Block message and attachment: This blocks the message from entering the system but informs the sender that the message contained an unacceptable attachment.
Silently delete message and attachment: This deletes the message before it enters the system without sending any notification to the sender or recipient.
Attachment filtering is
only available on Edge Transport servers, not on Hub Transport servers,
even if you installed the anti-spam agents.
Note: The
Attachment Filter agent included with Exchange 2010 detects file types
even if they have been renamed. Attachment filtering also ensures that
compressed files (.zip or .lzh files) don't contain blocked attachments
by performing a filename extension match against the files in the
compressed files.
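A sketch of managing attachment filters from the Exchange Management Shell on an Edge Transport server follows. The file name pattern, content type, and chosen action are examples, not recommendations.

```powershell
# Filter by file name extension (example: VBScript files).
Add-AttachmentFilterEntry -Name "*.vbs" -Type FileName

# Filter by MIME content type (example value).
Add-AttachmentFilterEntry -Name "image/jpeg" -Type ContentType

# Choose what happens when a filter matches: Reject, Strip, or SilentDelete.
Set-AttachmentFilterListConfig -Action Strip

# Review the entries and the configured action.
Get-AttachmentFilterEntry
Get-AttachmentFilterListConfig | Format-List Action, RejectResponse, AdminMessage
```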
11. Anti-Spam Reporting
Exchange 2010 comes with several scripts that create anti-spam activity reports. They are available in the <Exchange_Install_Path>\Scripts folder and can run on Edge Transport and Hub Transport servers. Table 5 provides an overview of the available scripts and their usage.
Table 5. Anti-Spam Reporting Scripts
SCRIPT | PURPOSE |
---|
Get-AntispamFilteringReport.ps1 | Creates a top-ten list of sources (such as agents) that are responsible for either rejecting connections and commands or for rejecting, deleting, or quarantining a message. |
Get-AntispamSCLHistogram.ps1 | Retrieves all entries for the Content Filter and groups them by SCL values. |
Get-AntispamTopBlockedSenderDomains.ps1 | Lists the top sender domains that were blocked by anti-spam agents. |
Get-AntispamTopBlockedSenderIPs.ps1 | Lists the top sender IPs that were blocked by anti-spam agents. |
Get-AntispamTopBlockedSenders.ps1 | Lists the top senders that were blocked by anti-spam agents. |
Get-AntispamTopRBLProviders.ps1 | Lists the top reasons for rejection by Block List Providers. |
Get-AntispamTopRecipients.ps1 | Lists the top recipients that were rejected by anti-spam agents. |
All of these example Windows PowerShell scripts analyze the transport agent logs. All anti-spam agents log information about the connections and messages they acted on, plus some contextual information such as the name of the RBL or the SCL of the e-mail. The transport agent log files are located at <Exchange_Install_Path>\TransportRoles\Logs\AgentLog.
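The same logs can also be queried directly with the Get-AgentLog cmdlet. The sketch below builds a rough SCL histogram for the last seven days. It assumes that Content Filter Agent entries record the SCL value in the ReasonData field; this is not one of the shipped scripts, and the seven-day window is an arbitrary example.

```powershell
# Look at the last seven days of agent log entries.
$since = (Get-Date).AddDays(-7)

Get-AgentLog -StartDate $since -EndDate (Get-Date) |
    # Keep only entries written by the Content Filter agent.
    Where-Object { $_.Agent -eq "Content Filter Agent" } |
    # Assumption: ReasonData holds the SCL value for these entries.
    Group-Object ReasonData |
    Sort-Object Name |
    Format-Table Name, Count -AutoSize
```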
Jon Webster
Systems Engineer, Elephant Outlook, U.S. Southeast
Exchange 2010 and Forefront Protection 2010 for Exchange Server (FPE) both store their transaction and agent logs in CSV files.
The built-in Get-AgentLog cmdlet first loads the entire set of CSV files into Windows PowerShell objects, which can then be filtered further. I have achieved a significant performance boost with my Select-CSVString script (available at http://poshcode.com/1649), which filters the lines as text first and then converts just those results to Windows PowerShell objects.
Let's assume the following: A
user on our system has opened a support ticket saying she was supposed
to get an e-mail from someone at fabrikam.com,
and it never arrived. Rather than getting the sender to locate the
bounce message, we can search the agent logs and see if something from
that domain was rejected within the last few days.
To look for this rejection using the built-in tool, I would use something like this:
Get-AgentLog | ?{ ($_.p1fromaddress -match "fabrikam.com" -or $_.p2fromaddresses -match "fabrikam.com") -and $_.action -eq "RejectMessage" }
Get-AgentLog
loads several gigs of agent logs, filters through potentially hundreds
of thousands of results, and finishes several minutes later.
To look for the same line with Select-CSVString, I would use something like this:
Select-CSVString -pattern "fabrikam.com.*Reject"
Select-CSVString searches several gigs of agent logs for the above regular expression, converts just those results into Windows PowerShell objects, and returns similar results in under 30 seconds. I say similar because Exchange and Forefront can use multiple lines for a single message (multiple recipients), which Select-CSVString doesn't handle. It could, but there just hasn't been much interest.
Couple that with another script to run Select-CSVString
against all edge servers and Hub Transport servers that run the
anti-spam agents using WinRM, and you can imagine the time it saves
your support staff.
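The text-first idea described in this sidebar can be sketched in plain Windows PowerShell as follows. This is not the Select-CSVString script itself. The function name Select-AgentLogText is hypothetical, and the sketch assumes that each log file carries its column names in a "#Fields: " comment line and uses the .log extension.

```powershell
# Hypothetical helper: match log lines as text first, then convert only the hits to objects.
function Select-AgentLogText {
    param(
        [string]$Path,      # folder that holds the agent log files
        [string]$Pattern    # regular expression to match against raw lines
    )
    foreach ($file in Get-ChildItem -Path $Path -Filter *.log) {
        # Assumption: column names are listed on a "#Fields: " comment line.
        $fieldsLine = Select-String -Path $file.FullName -Pattern '^#Fields: ' |
            Select-Object -First 1
        if (-not $fieldsLine) { continue }
        $header = ($fieldsLine.Line -replace '^#Fields: ', '') -split ','

        # Cheap text match first; skip comment lines; convert only matching rows.
        Select-String -Path $file.FullName -Pattern $Pattern |
            Where-Object { $_.Line -notmatch '^#' } |
            ForEach-Object { $_.Line } |
            ConvertFrom-Csv -Header $header
    }
}

# Example call; replace the placeholder with the real installation path.
Select-AgentLogText -Path "<Exchange_Install_Path>\TransportRoles\Logs\AgentLog" -Pattern "fabrikam\.com.*Reject"
```

Because only the matching lines are converted, the expensive object creation is limited to a handful of rows rather than the whole log set.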